{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cross-domain image transfer has been a hot topic, and various models have been used to achieve it. Recently, with the success of generative adversarial networks (GANs), people have started building GAN-based models for this task, and the results are quite impressive. In this blog, we will try to understand a recent development in this direction by Kim et al.: https://arxiv.org/pdf/1703.05192v1.pdf."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Transferring an image from one category to another involves several components, so let us first look at them in detail.\n",
    "\n",
    "1.) Extracting the features of an image (Encoding): First of all, what do we mean by the features of an image? Suppose we want to convert an image of a man into a look-alike woman. We first need to recognise the general parts of a human face, such as the nose, ears, eyebrows or hair. These are high-level features, but there are also low-level features such as edges and curves. It is from these low-level features that we recognise the high-level ones, which helps us break down the image and understand its details. The model used to extract these features in almost all neural networks is called a convolutional network, which you can read about in detail here: [Link to nice blog post on Convolution]\n",
    "\n",
    "\n",
    "\n",
    "2.) Learning the transformation from the features of one category to those of another (Transfer): After the first step, we have the high- and low-level features of an image, and we would like to convert them into those of another category. Let us look at one of the transformations that we need to learn. Suppose we detect that a man's face has a beard and we want to convert this face into that of a woman. We then need to remove the beard and smooth out that part of the face so that it matches the skin colour. Another example is turning thick eyebrows into thin ones, and so on.\n",
    "\n",
    "\n",
    "3.) Making the final image (Decoding): This step can be seen as the reverse of step 1. Here, we take the transformed features of an image and construct a face out of them.\n",
    "\n",
    "\n",
    "One thing to keep in mind: the network will not learn very high-level features such as 'this is an ear' or 'this part is the nose'. It is more likely to learn mid-level features, such as the upper part of the nose or the left part of the left eyebrow.\n",
    "\n",
    "\n",
    "Now that we have understood the key elements of an image transformation, let us see how they are modelled using a neural network. Once you are clear on these steps and have a basic knowledge of deep learning, you are ready to build the model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Extracting the features:\n",
    "\n",
    "For the basics of convolutional networks, you can go through this very intuitive blog post by []. The first step is to extract 64 very low-level features using a window of size [7, 7]. This can easily be done with a conv layer as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
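   },
   "outputs": [],
   "source": [
    "# Assumed argument order: input, number of features, window height, window width, stride height, stride width\n",
    "o_c1 = general_conv2d(input, 64, 7, 7, 1, 1)"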
   ]
  },
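  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The helper `general_conv2d` is not defined in this notebook. To make the idea of a conv layer concrete, here is a minimal NumPy sketch of what such a layer computes, assuming no padding and random, untrained filters. The name `naive_conv2d`, the 16x16 input and the filter values are all hypothetical and are not the author's implementation.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def naive_conv2d(image, kernels, stride):\n",
    "    # image: (H, W, C_in); kernels: (k_h, k_w, C_in, C_out); no padding.\n",
    "    k_h, k_w, _, c_out = kernels.shape\n",
    "    out_h = (image.shape[0] - k_h) // stride + 1\n",
    "    out_w = (image.shape[1] - k_w) // stride + 1\n",
    "    out = np.zeros((out_h, out_w, c_out))\n",
    "    for i in range(out_h):\n",
    "        for j in range(out_w):\n",
    "            # Window of the input that this output position looks at.\n",
    "            top, left = i * stride, j * stride\n",
    "            patch = image[top:top + k_h, left:left + k_w, :]\n",
    "            # Each output feature is the sum of patch * filter over the window.\n",
    "            out[i, j, :] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))\n",
    "    return out\n",
    "\n",
    "rng = np.random.RandomState(0)\n",
    "image = rng.rand(16, 16, 3)        # small hypothetical RGB image\n",
    "kernels = rng.randn(7, 7, 3, 64)   # 64 random 7x7 filters (untrained)\n",
    "features = naive_conv2d(image, kernels, stride=1)\n",
    "print(features.shape)\n",
    "```\n",
    "\n",
    "Each of the 64 output channels is one feature map. A real model would learn the filter values during training and use an optimised library routine instead of Python loops."
   ]
  },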
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we can keep on extracting further features and since we dont want to blow up the size of image, we will use stride of 2. It will prevent overlapping as well as reduce the size of output. So, we further extract features as follow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 3x3 windows with stride 2: each layer doubles the number of features\n",
    "o_c2 = general_conv2d(o_c1, 64*2, 3, 3, 2, 2)\n",
    "o_c3 = general_conv2d(o_c2, 64*4, 3, 3, 2, 2)"
   ]
  },
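  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see how the strides affect the output size, the standard formula for the output width of a convolution can be applied layer by layer. This is only a sketch: the 256x256 input size and the padding of `(kernel - 1) // 2` are assumptions, since the text does not state them, and `conv_output_size` is a hypothetical helper.\n",
    "\n",
    "```python\n",
    "def conv_output_size(size, kernel, stride, padding):\n",
    "    # Standard formula: floor((size + 2 * padding - kernel) / stride) + 1\n",
    "    return (size + 2 * padding - kernel) // stride + 1\n",
    "\n",
    "# (number of features, window size, stride) for the three conv layers above.\n",
    "layers = [(64, 7, 1), (128, 3, 2), (256, 3, 2)]\n",
    "\n",
    "size = 256  # assumed input width and height\n",
    "shapes = []\n",
    "for filters, kernel, stride in layers:\n",
    "    padding = (kernel - 1) // 2  # assumed padding, not stated in the text\n",
    "    size = conv_output_size(size, kernel, stride, padding)\n",
    "    shapes.append((size, size, filters))\n",
    "\n",
    "print(shapes)\n",
    "```\n",
    "\n",
    "Under these assumptions the width and height halve at each stride-2 layer while the number of features doubles."
   ]
  },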
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, as you can see we are extracting more and more high-level features from the previous layer of low-level features moving towards extracting very high level features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we can assume that after these step we will have fairly decent amount of features that we would like to transform to a woman's face. So we will build the transformation layers as follow. For these transformation we would like to maintain the same number of features, so the number of features of intermediate output of intermediate layers will be same."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
